Estimating fuzzy measures of deprivation at local level in Tuscany

In this paper we estimate monetary and non-monetary poverty measures at two sub-regional levels in the region of Tuscany (Italy) using data from the ad-hoc Survey on Vulnerability and Poverty held by Regional Institute from Economic Planning of Tuscany (IRPET). We estimate the percentage of households living in poverty conditions and three supplementary fuzzy measures of poverty regarding deprivation in basic needs and lifestyle, children deprivation, and financial insecurity. The key feature of the survey is that it was carried out after the COVID-19 pandemic, therefore, some of the items collected focus on the subjective perception of poverty eighteen months after the beginning of the pandemic. We assess the quality of these estimates either with initial direct estimates along with their sampling variance, and with a secondary small area estimation when the formers are not sufficiently accurate.


Introduction
The COVID-19 pandemic outbreak has affected all segments of the population and has been particularly detrimental to members of social groups in the most vulnerable situations. The pandemic has created both a public health crisis and a severe crisis on both the global and national economies and continues to affect populations especially in economic and social areas. Some recent studies have shown that not all the EU felt the pandemic impact on their economies to the same extent: the southern European countries like Spain, Croatia, Greece, and Italy, where the tourism sector plays a relevant role, were the most fragile (EU 2021).
In this paper, we study the economic poverty at regional and sub-regional level after the impact of COVID-19 pandemic in Tuscany, a region in central Italy that heavily founds its economy on exports and various forms of tourism. Tuscany is also a region that, in the face of an underlying cultural homogeneity, has a variety of natural and human environments, as well as structural and economic contrasts across different areas. 1 Thus, if we only look at regional level, large intraregional disparities are going to be masked.
Understanding poverty at local level is then essential to adequately identify this phenomenon, and to design local policies that aim at mitigating its consequences. Therefore, considering the heterogeneity of the regional territory, we think that it is particularly useful to analyse the phenomenon of economic poverty at sub-regional level. To this purpose, we use data from the Poverty and Vulnerability survey held by Regional Institute for Economic Planning of Tuscany (IRPET) in September 2021.
We consider two hierarchies of sub-regional disaggregation. The first sub-regional level that we consider is a partition of Tuscany into six areas, officially defined by IRPET, obtained as an aggregation of the so-called Local Labour Market Areas. Such grouping of six different areas, refers to the levels of employment, the remuneration of productive factors (labour and capital) and consequently, to different levels of wellbeing. In detail, the six areas are the following: Cities, the urban territories, with an important presence of the tertiary sector to businesses as well as to persons; Made in Italy, manufacturing areas based on the traditional production vocations of textiles, leather, leather goods, furniture, etc.; Other Industry, manufacturing areas not belonging to the Made in Italy sector; Seaside Tourism, coastal territories having a seasonal tourist characterization; Agritourism area, promoting sustainable agriculture and ecological tourism; Internal Areas Northern Apennines, the farthest area from centres, lacking of essential services. The second sub-regional level is the official NUTS-3 level.
When referring to domain measures of poverty and deprivation, we believe that it is important to understand poverty beyond monetary deprivations and to provide reliable evidence to monitor specific domain policy. Therefore, we use the survey data to estimate the percentage of households living in poverty conditions and three fuzzy supplementary measures of poverty based on a fuzzy approach. This approach has a longstanding usage in the literature, see ) for a review in social studies and (Tavares and Betti 2021) for the effect of COVID-19 on poverty in Brazil.
Given the importance of the results for regional policy making, we provide the estimated measures with an estimate of their uncertainty. When the uncertainty around the estimate is too large to be regarded as acceptable, we use small area estimation to obtain more accurate estimates. Empirically, we find out that in some areas/provinces, the estimated coefficient of variation is too large to get a proper level of accuracy of the direct estimates. Nevertheless, we obtain satisfactory results using small area estimation techniques.
The remainder of this paper is structured as follows: Sect. 2 discusses the survey and gives a preliminary regional picture after COVID-19 breakout in Tuscany focusing on preliminary monetary poverty measures and providing a general picture of the perceived economic situation of the inhabitants. Section 3 introduces the fuzzy measures of multidimensional poverty. It also introduces small area techniques as a tool to obtain more efficient estimates of poverty measures when the sample size is not enough to guarantee an adequate level of accuracy. Section 4 describes the empirical analysis and the results obtained at area and NUTS-3 level. It shows both direct and small area estimates. Section 5 concludes the paper and provides directions for further research work.

The survey and the regional scenario
The sample survey "Indagine sulla Vulnerabilità alla Povertà" was conducted in September 2021 by the Regional Institute for Economic Planning of Tuscany. Its focus is on the economic and social features of the Tuscan households, with particular attention to the current economic situation and prospects. A sample size of 2512 households has been drawn to achieve representativity at NUTS-3 level. Interviews were conducted by C.A.T.I. 2 and C.A.M.I. 3 methods, interviewing one adult household member. After a weighting procedure, the sample totals conform to the population totals as regards gender and age groups. As regards item nonresponses, missing data have been imputed by deductive imputations based on logical or mathematical relationships between the variables, where it was possible. Thirteen units having missing values for all the eleven deprivation indicators collected for the present situation and for the pre-COVID were discarded. Therefore, the valid units for the analysis are 2499. Item nonresponses relative to some quantitative and qualitative variables were imputed with stochastic imputation methods, assuming fully conditional specification. 4 The largest number of missing values (14,5%) was registered for the only question adopted to collect the approximative monthly total net household income. . Continuous values within each bracket have been imputed considering the kernel density estimate of the empirical distribution of the variable. Based on the total household disposable income, we retrieved the equivalized income using the OECD-modified equivalence scale. The poverty line was taken as the 60% of the median of such equivalised income distribution among the 5523 individuals present in the valid 2499 interviewed households.
According to IRPET 5 the households in absolute poverty in Tuscany went from 3.2% to 3.3% (a negligible increasing) thanks to the interventions put in place to protect families to contain the effects of the pandemic. Referring to a relative measurement approach and using the Eurostat-type poverty line, we estimated the head count ratio at regional level to be equal to 11.58%. 6 In order to have a general picture of the perceived economic situation of the inhabitants, Table 1 shows some descriptive results at regional level. In particular, the data collected from the question: "Taking into account your actual income, how can your household make ends meet? With great difficulty/some difficulty/difficulty/fairly easily/ easily/very easily?" (Ravallion 2014(Ravallion , 2015 show that more than half of the households (53,06%) make ends meet facing at least with "some difficulty".
Moreover, Table 2 shows that for about 33% of the households the economic situation at least "slightly worsened" with respect to the pre-pandemic period (2019).
As of what the households expect for the coming twelve months, by analysing the distribution of the households expectations, conditioned to the "ability to make ends meet", we notice that the difficulties to make ends meet increase as it does the percentage of households expecting worsening for the next months (see Fig. 1). Thus, the actual difficulties, even if strongly influenced by the contingent situation of the pandemic are perceived as a middle/long term situation.
Although monetary poverty can capture a household's ability to meet critical situation like the pandemic one, surely it does not capture all forms of deprivation. For  this reason, our analysis considers also non-monetary multidimensional measures of poverty.

Fuzzy measures of multidimensional poverty
The fuzzy sets approach is a valid instrument to measure multidimensional poverty and it offers the additional advantage of overcoming the use of unavoidably arbitrary poverty thresholds. In such a way, it avoids extreme simplifications and loss of statistical information, deriving from the rigid poor/non poor dichotomy. The fuzziness is accounted for via a poverty membership function, measured on a scale from 0 to 1-whereby 1 means full membership to the set of the poor and 0 full non-membership-which allows to deal with such a blurry (as opposed to sharp) vision of the poverty concept. The conventional classification into a rigid dichotomy according to the traditional approach to poverty can then be viewed as a special case of the fuzzy conceptualization of poverty, where the membership function equals 1 for those below the poverty line and 0 for those above the poverty line.
There are several advantages of treating poverty and deprivation as a matter of degree, applicable to all members of the population, rather than as simply a 'yes-no' state. First, Nonmonetary deprivation depends on forced non-access to various facilities or possessions determining the basic conditions of life. An individual may have access to some but not to others. Hence, non-monetary deprivation is inherently a matter of degree, and some quantitative approach such as the present one is essential. Second, further insights into the relative income situations of individuals and groups can be obtained by incorporating into the poverty rates a measure of the actual levels of incomes received, particularly at the lower end of the income distribution. Third, the fuzzy approach provides more robust indicators of poverty (or more generally, of deprivation in multiple dimensions) in the longitudinal context. The conventional approach measures mobility, simply in terms of movements across some designated poverty line, does not reflect the actual magnitude of the changes affecting individuals at all points in the distribution. Consequently, the degree of mobility of persons near the chosen line tends to be over-emphasised, while that of persons far from that line largely ignored. Betti and Verma (2008) proposed the following fuzzy measures based on the seminal contributions of (Cerioli and Zani 1990;Cheli and Lemmi 1995); then furtherly elaborated in (Betti et al. 2015). In the generalised form, the membership function of monetary or non-monetary deprivation is defined for any individual of rank j in the ascending income distribution as: where s is the overall score in the non-monetary deprivation (defined below), w γ is the sample weight of individual of rank γ and is a parameter to be estimated on the data (see step 6 below). It is possible to reformulate formula (1) as: where 1 − F j,K is the proportion of individuals less poor (less deprived) than the person concerned and L j,K represents the value of the Lorenz curve for individual j.
As regards the computation of the fuzzy supplementary (FS) measures, the general procedure to quantify and put together diverse indicators of deprivation is based on the following steps: 1. Identification of items of deprivation to be included in the analysis. 2. Transformation of the items in the [0, 1] interval. 3. Exploratory and confirmatory factor analysis to identify measures of deprivation. 4. Calculation of weights of individual items of deprivation within each dimension. 5. Calculation of scores within each dimension.
Calculation of an overall score and parameter . Construction of the fuzzy deprivation measures separately in each dimension, taking their simple average as a measure of overall non-monetary (supplementary) deprivation.
We remind to Betti et al. (2015) for more details on the steps 2 and 4 above. Broadly speaking, Step 2 maps categorical items to the [0,1] interval using the distribution function of the item. We do not need this step because the items that we consider are already in [0,1].
Step 4 calculates a weight for each item based on the coefficient of variation of the transformed item (Step 2) and the correlation of the item with those in the same dimension found in Step 3. Let s j,h be the score of the unit i computed for for each dimension h in step 5. The overall score s j is calculated as a simple average over the m dimensions As in step 2, to transform a generic item into the [0, 1] interval we remind the reader to (Betti et al. (2015). In the case of dichotomous items, like those used in this paper, the deprivations score s is 0 for non-deprivation and 1 otherwise.
As of step 6 and the calculation of the parameter this is done by solving the non-linear equation: where k denotes the expectation of the fuzzy membership function (in dimension k) with respect to the probability measure induced by the sampling design and Ĥ CR is the estimated Head Count Ratio using survey data.

Small area estimation
Sample surveys are widely used in practice to provide estimates not only for the whole target population of interest, but also for a variety of its subsets or subdomains. These can be either geographical like areas, counties, districts, or any sub-population, such that the survey is usually designed to be representative at a higher hierarchy level. In general, it is common to refer to estimates that use only sample weights as direct estimates, that is, a direct estimator is design-based. Sometimes though, these direct estimates may suffer from high variance if the number of sample units in the subdomain is not large enough. The term small area is used in the literature to refer to these subdomains. The idea behind Small Area Estimation (SAE) is to borrow strength from auxiliary variables to obtain indirect estimators that may exhibit a lower mean squared error than that of the direct estimator. Many small area models have been proposed in the literature (Rao and Molina 2015), in this paper we make use of the Fay-Herriot model (FH) (Fay and Herriot 1979). The setting is as follows: let ̂ i be an unbiased direct estimator of the i-th area parameter i so that where M is the number of small areas, z i is a vector of area level covariates and v i ∼ N(0, 2 v ) , so that by combining the two equations above we obtain The Best Linear Unbiased Predictor (BLUP) estimator of i is then given by and ̃ is the Best Linear Unbiased Estimator (BLUE) of . As the BLUP estimator depends on the unknown parameter 2 v , the empirical BLUP (EBLUP) is obtained by replacing it with a consistent estimator ̂ 2 v , so that the EBLUP estimator of ̂ i turns out to be where ̂ is the BLUP estimator of having plugged in the estimator of 2 v . Thus, the small area estimator under the FH model is a linear combination of the direct estimator ̂ i and the synthetic estimator from the model in Eq. (8), where greater the variance of the direct estimator greater the weight attached to the synthetic estimate. Concerning the details about the estimation of the mean squared error of the FH estimator, for the sake of brevity of exposition, we limit to say that there exists substantially to approaches, namely: direct computation or the bootstrap. For more details on the two we remind the reader to (Rao and Molina 2015).

Measurement issue: fuzzy measures computation
To compute the fuzzy supplementary measure of poverty, we consider eleven binary deprivation indicators focusing on households' current situation (September 2021). 7 The indicators are based on standard questions regarding: affordability to eat nutritional meals, to keep household adequately warm, to cover costs for health, to cover costs for 1 week holiday, to cover costs for cinema, theatre, eating out once a month, to cover costs for transport, for children clothes, toys, specific children food); to cover costs for education such as taxes, books and materials and finally and then ability to cope with unexpected expenses of different amount.
The frequency distribution of these items is shown in Fig. 2. Among these, we notice that more than 1000 households (about 42%) cannot afford a one-week holiday. Also, most households cannot afford an unexpected 5000€ expense. Table 3 shows the matrix of tetrachoric correlations between items. Although, the correlations are moderate in general, the group of items that regard the possibility to cope with unexpected expenses are highly correlated. Also, it is reasonable that households that tend to spend more in Children care are also those that have higher expenditure in education. Interestingly, there is also a substantial correlation between the possibility of affording one-week holyday and expenditure in recreative activities. As of step 2 of constructing FS measures, there is no need to rescale the items, as they are already in [0,1].   The dimensions of deprivation have been further investigated by an exploratory factor analysis (see Table 3) to identify the hidden dimensions of the multidimensional poverty (step 3). The three factors extracted account for 53% of the total variance.
The Kaiser-Meyer-Olkin measure of sampling adequacy (0.81) indicates that the factor analysis method is suitable for the collected data (Table 4).
Successively the latent structure identified has been validated using a confirmatory factor analysis (CFA). The interpretation of these results is shown in Table 5. The dimension (FS3) refers to "Inadequate Basic needs and non-inclusive lifestyle". Indeed, the indicators that mostly contribute to this dimension, all refer to the lack of possibility of satisfying basic needs and the possibility to live with an inclusive lifestyle.
The second dimension shows high factor loadings for items regarding expenditure for children needs and education. Therefore, we interpret it as an indicator of "Children specific deprivation" (see Carraro and Ferrone 2020;Benedetti et al. 2020 for related studies on this topic). The dimension FS1 involves expenditure inability to cope with unexpected expenses, therefore we interpret it as an indicator of "Financial insecurity". Indeed, we use to say that households are financially insecure if they have not enough assets to face an event that decreases incomes or increases expenses (Prieto 2022).
Overall, the item complexity of each item is close to 1 (below two for 10 out 11 items) suggesting that the items reflect approximately one construct each. The only item that has an item complexity value above two is that of covering costs of transports. The reason of this may be in that this item is approximately correlated to all other items in the same extent, so that it is less clear to what dimension it belongs more.
Then, we calculate weights (step 4) and scores (step 5) withing each dimension and solve the non-linear equation in Formula 4 using the expectation with respect to the probability measure induced by the sampling design (step 6). This step is done using the average in Formula 3. Having obtained the value of , we use it to calculate Formula 2 in each dimension separately.

Measurement issue: small area level estimation
As said in the introduction of the paper. we consider the IRPET's territory subdivision of Tuscany into the six areas (Table 6) and NUTS-3 (Table 7). These tables suggest that in some domains the sample size is likely too small to produce reliable estimates at local level. In fact, the reliability of the estimated parameter is often related to the sampling variability.
The results from direct estimation shown in the section below confirm this intuition and, consequently, the need to use SAE methods. To apply such methods, we first estimate sampling errors for each domain using the bootstrap. Next, we review possible (local) data sources suitable to find auxiliary variables at small area level. The auxiliary variables that we use (see Formula 7) come from administrative data source (IRPET), therefore they are measured without error (see Arima et al. 2015;Bell et al. 2019 for a dissertation on when covariates are measured with errors).
To assess the accuracy of the results based on the survey data we use the coefficient of variation as it is a standardized measure of the sampling variability. 8 Statistics Canada, 9 provides guidelines for publication related to the uncertainty of estimates specifying the following levels of data quality: excellent (0-5%); very good (5-15%); good (15-25%); acceptable (25-35%); (> 35%) use with caution. Nevertheless, many Official Statistical Agencies do not publish estimates with CV higher than 20%.

Area level estimation
This section reports the estimates of the monetary and non-monetary poverty measures at area and NUTS-3 level using the survey data. The estimation of the percentage of households living in poverty conditions has used the imputed income described in Sect. 2. All SAE estimates are compared with the corresponding direct estimates either in terms of their point estimates and coefficients of variations (CVs). Small area etimates were obtained using the sae R-package (Molina and Marhuenda, 2015).
Starting with the six areas, the estimated HCRs (Fig. 3) show that "Cities" and "Made in Italy" have less households living in poverty conditions while "Agritourists" and "Internal Areas" are the poorest areas. Interestingly, the SAE estimate of "Internal Areas" and "North-Apennines" revises downwards the direct estimates. Figure 4 shows that the gains in efficiency of the EBLUPs tend to be larger for areas with smaller sample sizes. Thus, EBLUPs based on FH model seem more reliable than direct estimators. To obtain these estimates we used the following auxiliary variables: the percentage of people employed (for HCR). and the percentage of people receiving citizenship retirement benefits (for FS1. FS2. FS3).
Financial insecurity (FS1) is the most dominant dimension of poverty in all areas considered (Fig. 5), and it is followed in a less extent by children-specific vulnerability. Interestingly, the "Cities" area is the one that experiences much less the dimension of financial insecurity, maybe, this is due to that households have enough assets to face an event, like the pandemic, that in many cases decreases incomes. This result is coherent with the HCR (Table 8) as for "Cities" this is much lower than in other areas. Figure 6 show that we obtain the major gains in efficiencies in the areas having a lower sample size although in the most sampled areas. The mean squared error of the small area estimate is sometimes larger than that of the direct estimate. However, this is not necessarily a problem as the estimates in these areas have good quality. Table 8 reports the final estimates for the areas considered along with their root mean squared errors. For each area, we report the most efficient estimate between the direct estimate and the model-based estimate.

NUTS-3 level estimation
As regards NUTS-3 level results, we notice (see Table 7) that Prato and Massa-Carrara are the provinces with the smallest sample sizes, while Florence and Pisa have the two greatest ones. In Fig. 7, the estimated HCR is tracked, and we can observe that EBLUPs track direct estimators but are much less volatile. According to the EBLUP HCRs we observe that Siena, Firenze, Prato and Arezzo show figures below the regional HCR and that Massa and Grosseto present much greater figures. At this level of territorial disaggregation. Figure 8 shows that only three provinces (Prato, Arezzo, and Siena) have HCR direct estimates with CVs over 25% (acceptable), whereas the CVs of the SAE estimates do not exceed 22% for any of the areas (very good or good quality).
Regarding non-monetary poverty (see Fig. 9), financial insecurity (FS1) is the dimension that dominates in all the provinces considered. In this dimension, the less deprived province is Prato, followed by Arezzo. The ten provinces show similar deprivation measures as regards FS2. FS3 is a dimension that shows the lowest values and interestingly, it is particularly contained in the provinces of Arezzo and Prato. These provinces together with Florence are at the same time those that have the lowest values of the monetary poverty (Table 9). Moreover, in Florence, where the tourism sector plays a key role, the pandemic impacted more than in Arezzo and Prato with economies largely based on manufacturing sector.
Regarding the FS measures, Fig. 10 shows the CVs of the direct estimates and the small area estimates. All the non-monetary poverty measures are estimated with a lower error in all the provinces considered with exception of Florence. The reason could be the variability of the indicators in Florence, indeed the variance of estimation is a function of the estimation itself, so that we expect larger uncertainty when the estimation focuses on rare events  (Wolter 2007). However. the CVs of EBLUPs do not exceed 23% for any of the provinces (very good or good quality).
The auxiliary variables that we used to obtain small area estimates are the weighted 10-th percentile of the income distribution of the total income distribution (for HCR), and the percentage of households owning the house where they live (for FS1, FS2, FS3). Table 9 shows the final estimates at NUTS-3 level. As we did before, we report the most efficient estimate between the direct and the model-based estimate.

Conclusions
In this paper, we estimated monetary and non-monetary poverty measures at two different levels of disaggregation in Tuscany. The data comes from a sample survey held by the Regional Institute for Economic Planning of Tuscany (IRPET), focusing on the economic  and social features of the Tuscan households, with particular attention to the current economic situation (September 2021) and prospects.
In particular, we estimated the percentage of households living below the poverty line and three (fuzzy) supplementary measures of poverty that we identified in: deprivation in basic needs and social inclusion, children specific deprivation, and financial insecurity. Our results reveal the areas and the provinces with the major amount of people living in (monetary) poverty conditions, but also, they show that the perception of financial insecurity is dominating at each level of territorial disaggregation. Also, it reveals that children vulnerability is a problem that is similarly spread in Tuscany, while poverty in basic needs seems to have hit some areas/provinces more than others. Thus, measures related to monetary issues, as HCR and FS1 are those presenting more heterogeneity across provinces and across the six areas.
The importance of these findings is crucial from the point of view of local policy making, indeed specific addressed policy could reduce regional vulnerability and/or foster the capacity for resilience (Sánchez and Jiménez-Fernández 2022).
Having accurate measures at small area level is a crucial comprehensive information base for stakeholders and policy makers that can adopt the results as a starting point for the design of policies for the poor and vulnerable groups with the aim of i) addressing the gaps in social protection of Tuscan citizens and ii) making the regional welfare system more resilient to future shocks through more effective and flexible policies.
For these reasons we estimated the variance of the measures discussed as an assessment of their reliability. We noticed that in some areas/provinces, the estimated coefficients of variation were not so contained to guarantee a proper level of accuracy of the direct estimates.
For this reason, we used small area estimation to obtain new estimates with lower mean squared error. With this technique we were able to take almost all the estimates at a level more than acceptable. To state if an estimate is acceptable, we used the guidelines by Statistics Canada. However, some studies apply small area estimation regardless of the magnitudes of CVs (see Graf et al. 2019 as example).
Nevertheless, for some areas the small area gains were not always sufficient. These results may be in part because the number of areas considered is low so that the variance resulting from small area estimation may be overestimated. In principle, this is not a major problem if the variance is estimated in a conservative fashion. However, when the areas/ province show a good level of accuracy we use the direct estimates as an official estimate. In fact, the properties of direct estimators are well-known while small area models may suffer from bias when the number of areas is small. We also noticed that in provinces with large sample size, the small area error tends to be greater. Situation like this have been already addressed in the literature and seem to be an effect of the large sample size.
Funding Open access funding provided by Università degli Studi di Siena within the CRUI-CARE Agreement. The authors declare that no funds, grants, or other support were received during the preparation of this manuscript.

Declarations
Competing interests The authors have no relevant financial or non-financial interests to disclose.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http:// creat iveco mmons. org/ licen ses/ by/4. 0/.